SpamCooling: A Parallel Heterogeneous Ensemble Spam Filtering System Based on Active Learning Techniques
Authors
Abstract
Anti-spam technology has developed rapidly in recent years. With the emerging applications of machine learning in diverse fields, researchers and manufacturers around the world have tried a large number of related algorithms to prevent spam. In this paper, we design an effective anti-spam protection system, SpamCooling, based on active learning and parallel heterogeneous ensemble learning techniques. The system filters spam in batches and can be easily integrated with existing mail clients (MUAs). It actively obtains user feedback to provide users with a personalized spam filtering experience. The parallel heterogeneous ensemble method helps the system achieve a high spam detection rate as well as a low ham misclassification rate.
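The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not SpamCooling itself. It assumes three hypothetical ensemble members of different kinds (a tiny naive Bayes model, a keyword rule, and an upper-case heuristic), runs them concurrently on each message of a batch, takes a majority vote, and flags a message for user feedback whenever the members disagree. All names, word lists, and thresholds are illustrative assumptions, and the thread pool only illustrates the parallel structure.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (add-one smoothing)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        # Also used to fold user feedback back into the model.
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def predict(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            total = sum(self.word_counts[label].values())
            score = math.log((self.doc_counts[label] + 1) / (n_docs + 2))
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(vocab) + 1))
            scores[label] = score
        return "spam" if scores["spam"] > scores["ham"] else "ham"


def keyword_rule(text):
    """Hypothetical hand-written rule: any blacklisted word means spam."""
    blacklist = {"viagra", "lottery", "winner"}
    return "spam" if blacklist & set(text.lower().split()) else "ham"


def shouting_rule(text):
    """Hypothetical heuristic: mostly upper-case messages are spam."""
    letters = [c for c in text if c.isalpha()]
    upper = sum(c.isupper() for c in letters)
    return "spam" if letters and upper / len(letters) > 0.6 else "ham"


def classify_batch(messages, classifiers):
    """Majority-vote each message; flag it for feedback if members disagree.

    Use an odd number of classifiers so the two-class vote cannot tie.
    """
    results = []
    with ThreadPoolExecutor(max_workers=len(classifiers)) as pool:
        for text in messages:
            votes = list(pool.map(lambda clf: clf(text), classifiers))
            label = Counter(votes).most_common(1)[0][0]
            needs_feedback = len(set(votes)) > 1
            results.append((label, needs_feedback))
    return results


if __name__ == "__main__":
    nb = NaiveBayes()
    nb.train("cheap lottery winner claim prize", "spam")
    nb.train("meeting agenda for monday", "ham")
    batch = ["claim your lottery prize", "agenda for the monday meeting"]
    members = [nb.predict, keyword_rule, shouting_rule]
    for text, (label, ask) in zip(batch, classify_batch(batch, members)):
        print(label, "ask user" if ask else "confident", "|", text)
```

In this sketch, a message flagged for feedback would be shown to the user, and the confirmed label passed to `NaiveBayes.train`, which is one simple way to personalize the filter over time.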
Related resources
Application of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
A Comparison of Ensemble and Case-Base Maintenance Techniques for Handling Concept Drift in Spam Filtering
The problem of concept drift has recently received considerable attention in machine learning research. One important practical problem where concept drift needs to be addressed is spam filtering. The literature on concept drift shows that among the most promising approaches are ensembles and a variety of techniques for ensemble construction has been proposed. In this paper we consider an alter...
Evaluating ensemble classifiers for spam filtering
In this study, the ensemble classifier presented by Caruana, Niculescu-Mizil, Crew & Ksikes (2004) is investigated. Their ensemble approach generates thousands of models using a variety of machine learning algorithms and uses a forward stepwise selection to build robust ensembles that can be optimised to an arbitrary metric. On average, the resulting ensemble out-performs the best individual ma...
Single-Class Learning for Spam Filtering: An Ensemble Approach
Spam, also known as Unsolicited Commercial Email (UCE), has become an increasingly annoying problem for individuals and organizations. Most prior research has formulated spam filtering as a classical text categorization task, in which training examples must include both spam emails (positive examples) and legitimate mails (negative examples). However, in many spam filtering scenarios, obtaining legitimate ...
A Machine Learning Approach to Server-side
Spam-detection systems based on traditional methods have several obvious disadvantages, such as a low detection rate, the need for regular knowledge-base updates, and impersonal filtering rules. New intelligent methods for spam detection, which use statistical and machine learning algorithms, solve these problems successfully. But these methods are not widespread in spam filtering for enterprise-level ...
Journal title:
- JCIT
Volume 5, Issue
Pages -
Publication date: 2010